🤖 fix: stabilize CoderProvisioner reconcile and startup #73

Merged: ThomasK33 merged 1 commit into main from fix/coderprovisioner-reconcile-startup on Feb 13, 2026

@ThomasK33

Summary

This PR fixes CoderProvisioner bring-up and reconcile stability issues discovered in smoke testing.

Background

During an end-to-end smoke test, CoderProvisioner failed to reach Ready because of RBAC escalation issues and provisioner pod startup failures. Logs also showed recurring metadata backfill and key rotation activity that could hit coderd on every reconcile retry and trigger rate limiting.

Implementation

  • Added manager RBAC markers for delegated provisioner role verbs on:
    • pods (get/list/watch/create/update/patch/delete)
    • persistentvolumeclaims (get/list/watch/create/update/patch/delete)
  • Regenerated RBAC output so config/rbac/role.yaml includes those verbs.
  • Stopped coderd metadata-backfill churn when a usable key secret already exists: that retry path now stamps the status baseline fields locally instead of calling coderd again or rotating keys.
  • Updated provisioner pod launch wiring:
    • explicit command: ["coder"]
    • args use coder provisioner start
    • removed CODER_ORGANIZATION injection for key-auth daemon startup (avoids CLI rejection when --key is used).
  • Updated controller tests accordingly.
  • Regenerated CRD/docs for the updated API comment text.

Validation

  • make verify-vendor
  • make test
  • make build
  • make lint
  • make test-integration
  • make manifests
  • make docs-reference
  • Runtime smoke verification in kind:
    • CoderProvisioner reaches phase=Ready, readyReplicas=1
    • provisioner pod logs include "successfully connected to coderd"

Risks

  • Low to moderate: changes are isolated to CoderProvisioner reconcile/deployment composition and generated RBAC/docs.
  • Main behavior change is intentionally avoiding metadata-backfill coderd calls when a usable key secret already exists; this reduces churn/rate-limit risk during retry loops.

Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $0.00

@ThomasK33

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8118ac233d

Comment thread on internal/controller/coderprovisioner_controller.go (outdated)

Fixes CoderProvisioner bring-up failures and reconcile churn by:
- granting manager RBAC verbs required to create delegated provisioner Roles
- preventing coderd key metadata-backfill retries from repeatedly rotating keys
- launching provisioners with explicit command/args compatible with coder images
- avoiding CODER_ORGANIZATION injection for key-auth provisioners
- updating tests and generated CRD/docs to match behavior
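
To illustrate why the manager needs these verbs itself: Kubernetes refuses to let a subject create a Role that grants permissions the subject does not already hold (unless it has the `escalate` verb). The sketch below shows marker comments in the form controller-gen reads, plus a small coverage check. `Rule` and `missingVerbs` are hypothetical names for illustration, and the "before" permission set in `main` is an assumed example, not taken from the repository.

```go
package main

import "fmt"

// RBAC markers in the form controller-gen uses to generate role.yaml.
// Their exact placement in the real controller is an assumption.
// +kubebuilder:rbac:groups="",resources=pods,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=persistentvolumeclaims,verbs=get;list;watch;create;update;patch;delete

// Rule is a simplified, hypothetical stand-in for rbacv1.PolicyRule.
type Rule struct {
	Resource string
	Verbs    []string
}

// missingVerbs returns, per resource, the verbs that the delegated rules
// grant but the held rules do not cover. A non-empty result means creating
// the delegated Role would be rejected as a privilege escalation.
func missingVerbs(held, delegated []Rule) map[string][]string {
	have := map[string]map[string]bool{}
	for _, r := range held {
		if have[r.Resource] == nil {
			have[r.Resource] = map[string]bool{}
		}
		for _, v := range r.Verbs {
			have[r.Resource][v] = true
		}
	}
	missing := map[string][]string{}
	for _, r := range delegated {
		for _, v := range r.Verbs {
			if !have[r.Resource][v] {
				missing[r.Resource] = append(missing[r.Resource], v)
			}
		}
	}
	return missing
}

func main() {
	full := []string{"get", "list", "watch", "create", "update", "patch", "delete"}
	// Assumed example: a manager that can only read pods.
	before := []Rule{{Resource: "pods", Verbs: []string{"get", "list", "watch"}}}
	delegated := []Rule{{Resource: "pods", Verbs: full}}
	fmt.Println(missingVerbs(before, delegated))
}
```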

@ThomasK33 force-pushed the fix/coderprovisioner-reconcile-startup branch from 8118ac2 to 2bbf959 on February 13, 2026 10:17
@ThomasK33

@codex review

Addressed the review feedback by restoring metadata revalidation/rotation in the empty-status branch and persisting key-related status immediately after key+secret reconciliation to avoid repeated coderd churn on later retries.
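
The revised flow might look roughly like the sketch below: when the recorded status is empty, coderd is still consulted, but the key-related status is persisted right away so that later retries can skip the call. All names here (`KeyStatus`, `coderdClient`, `EnsureKey`, `reconcileKey`, `countingClient`) are hypothetical and chosen for illustration; the real controller's types, status fields, and secret handling are not shown in this conversation.

```go
package main

import "fmt"

// KeyStatus is a hypothetical subset of the key-related status fields.
type KeyStatus struct {
	KeyName    string
	SecretName string
}

// coderdClient is a hypothetical interface for the coderd key API.
type coderdClient interface {
	EnsureKey(name string) (string, error)
}

// countingClient is a fake client that counts how often coderd is called.
type countingClient struct{ calls int }

func (c *countingClient) EnsureKey(name string) (string, error) {
	c.calls++
	return "secret-for-" + name, nil
}

// reconcileKey calls coderd only when the recorded status does not match the
// desired key and secret, then persists the status immediately on success.
func reconcileKey(status *KeyStatus, client coderdClient, keyName, secretName string,
	persist func(KeyStatus) error) error {
	if status.KeyName == keyName && status.SecretName == secretName {
		return nil // status already recorded: no coderd call on this retry
	}
	if _, err := client.EnsureKey(keyName); err != nil {
		return err
	}
	// Writing the key material into the secret is elided in this sketch.
	next := KeyStatus{KeyName: keyName, SecretName: secretName}
	if err := persist(next); err != nil {
		return err
	}
	*status = next
	return nil
}

func main() {
	client := &countingClient{}
	var status KeyStatus
	persist := func(KeyStatus) error { return nil } // stands in for a status update
	for attempt := 1; attempt <= 3; attempt++ {
		if err := reconcileKey(&status, client, "provisioner-key", "provisioner-key-secret", persist); err != nil {
			fmt.Println("reconcile failed:", err)
			return
		}
	}
	fmt.Println("coderd calls:", client.calls) // prints "coderd calls: 1"
}
```

Persisting before any later reconcile step runs matters because a failure further down would otherwise leave the status empty and send the next retry back to coderd.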

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

@ThomasK33 added this pull request to the merge queue on Feb 13, 2026
@ThomasK33
Merged via the queue into main with commit 26f9704 Feb 13, 2026
11 checks passed
@ThomasK33 deleted the fix/coderprovisioner-reconcile-startup branch on February 13, 2026 10:26